Abstract: Compressed sensing (CS) demonstrates that sparse signals can be recovered from underdetermined linear measurements. We focus on the joint sparse recovery problem, where multiple signals share a common sparse support and are measured through the same sensing matrix. Leveraging a recent information-theoretic characterization of single-signal CS, we formulate the optimal minimum mean square error (MMSE) estimation problem and derive a belief propagation algorithm, its relaxed version, and an approximate message passing algorithm for the joint sparse recovery problem. In addition, using density evolution, we provide a sufficient condition for exact recovery.

I. Introduction 

Compressed sensing (CS) [1], [2] has revolutionized sparse signal processing from underdetermined linear measurements. CS offers a sharp contrast to the traditional sensing and processing paradigm, which first samples the entire data at the Nyquist rate, only to later throw away most of the coefficients. Owing to the potential for reduced measurement rates, CS has become an active research area in signal processing.

An important area in compressed sensing research is known as distributed CS [3]. Distributed CS is based on the premise that joint sparsity within signal ensembles enables a further reduction in the number of measurements. Motivated by sensor networks [4], preliminary work in distributed CS [3], [5] showed that the number of measurements required per sensor must account for the minimal features unique to that sensor, while features that are common to multiple sensors are amortized among sensors. Distributed CS led to a proliferation of research on the multiple measurement vector (MMV) problem [6]-[8]. The MMV problem considers the recovery of a set of sparse signal vectors that share a common non-zero support through an identical sensing matrix, and ties into several applications of interest, such as sensor networks, radar, and parallel MRI.

In single signal CS, recent results have established the fundamental performance limits in the presence of noise [9], [10]. For sparse measurement matrices, belief propagation (BP) CS reconstruction [11] is asymptotically optimal. Based on the revelation that the posteriors in CS signal estimation are similar in form to outputs of scalar Gaussian channels [10], additional recent results [12], [13] have demonstrated the potential for faster algorithms for implementing BP. Another recent breakthrough is the discovery of an approximate message passing (AMP) algorithm [13], which was originally derived as a fast approximation of BP, and is strikingly similar to iterative thresholding [14] while achieving theoretically optimal mean square error.

Leveraging the aforementioned recent progress in single measurement vector CS, this paper extends BP to the MMV problem. In particular, we show that BP can be formulated as vector message passing by considering the general signal correlation structures between input vectors. Next, a relaxed BP algorithm is derived based on a Gaussian assumption for the messages, and conditions for the edge-independent covariance update are rigorously derived. Finally, we provide AMP update rules by further removing the edge dependency of the mean update. As a byproduct, we provide a sufficient condition for exact joint sparse recovery using AMP state evolution.

II. Signal and Measurement Model 

Signal model: We consider a model in which an ensemble of $J$ length-$N$ signals are jointly sparse as follows. Our notation for a matrix $X$ uses $x^n$ for the $n$-th row vector and $x_j$ for the $j$-th column vector, i.e.,

$$X = [x_1, \cdots, x_J] \quad \text{and} \quad X = [(x^1)^T, \cdots, (x^N)^T]^T.$$

(S1) Each signal in $\{x_j\}_{j=1}^{J}$ belongs to $\mathbb{R}^N$, i.e., $x_j \in \mathbb{R}^N$; for each $n = 1, \cdots, N$, the $n$-th component $x_{nj}$ of $x_j$ for $j = 1, \cdots, J$ is given by $x_{nj} = b_n u_{nj}$.

(S2) The $N$ random variables $\{b_n\}_{n=1}^{N}$ representing the support are independent and identically distributed (iid) Bernoulli random variables (RVs) with probability $\epsilon$ of being one and $1 - \epsilon$ of being zero.

(S3) The random vectors $u^n \in \mathbb{R}^J$ are iid random vectors, which have a multivariate normal distribution $\mathcal{N}_J(u; 0, \Lambda)$ with zero mean and covariance matrix $\Lambda$, so that each row $x^n = b_n u^n$ has the pdf

$$f_x(x^n) = \epsilon\, \mathcal{N}_J((x^n)^T; 0, \Lambda) + (1 - \epsilon)\, \delta((x^n)^T).$$

Measurement model: For each $x_j$, a measurement vector $y_j \in \mathbb{R}^M$ containing $M$ noisy linear measurements is derived by multiplying the signal $x_j$ by a measurement matrix $\Phi \in \mathbb{R}^{M \times N}$ and adding noise $z_j$,

$$y_j = \Phi x_j + z_j,$$

where the noise vector $z_j$ is iid zero-mean Gaussian with noise variance $\sigma^2$. Following the terminology of sensor array signal processing, $y_j$ denotes the $j$-th snapshot, and we refer to $J$ as the number of snapshots.
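The signal and measurement models can be made concrete with a short simulation. The following is a minimal sketch rather than the authors' code: the function name `make_mmv_problem` and its default arguments are hypothetical, $\Lambda = I$ is assumed for simplicity, and the sparse sensing matrix follows the construction described in the numerical results section (random signs, each entry retained with probability $d/M$, scaled by $1/\sqrt{d}$).

```python
import numpy as np

def make_mmv_problem(N=100, M=50, J=3, eps=0.1, d=20, sigma=0.03, seed=0):
    """Draw a jointly sparse X and noisy snapshots Y = Phi X + Z (illustrative only)."""
    rng = np.random.default_rng(seed)
    # (S2): iid Bernoulli(eps) support indicators, shared by all J snapshots.
    b = rng.random(N) < eps
    # (S3): iid Gaussian rows u^n ~ N_J(0, Lambda); Lambda = I is an assumption here.
    Lam = np.eye(J)
    U = rng.multivariate_normal(np.zeros(J), Lam, size=N)   # shape (N, J)
    # (S1): x_{nj} = b_n * u_{nj}, so rows outside the support are exactly zero.
    X = b[:, None] * U
    # Sparse sensing matrix: random signs, kept with probability d/M, scaled by 1/sqrt(d).
    signs = rng.choice([-1.0, 1.0], size=(M, N))
    keep = rng.random((M, N)) < d / M
    Phi = signs * keep / np.sqrt(d)
    # iid zero-mean Gaussian noise with standard deviation sigma.
    Z = sigma * rng.standard_normal((M, J))
    Y = Phi @ X + Z
    return X, b, Phi, Y
```

Each column of `Y` is one snapshot $y_j$, and all snapshots share the row support encoded by `b`.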



To analyze belief propagation (BP), we consider the factor graph $G = (V, F, E)$ with variable nodes $V = [N] := \{1, \cdots, N\}$, factor nodes $F = [M]$, and edges $E \subseteq \{(m, n) : m \in [M], n \in [N]\}$, so that $G$ is a bipartite graph with $M$ factor nodes and $N$ variable nodes. We let $E = \{(m, n) \in [M] \times [N] : \Phi_{mn} \neq 0\}$. We consider a large system limit where $\epsilon$, $\sigma$ and $J$ are constants, but the signal length $N$ goes to infinity, and the number of measurements $M = M(N)$ also goes to infinity,

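$$\lim_{N \to \infty} \frac{M}{N} = \delta,$$

where $\delta > \epsilon$ in problems of practical interest. In this setting, we let $d$ be a positive integer such that $d < M$ and the elements of $\Phi$ depend on $d$. Then we let $d \to \infty$, so that we can utilize the central limit theorem to analyze the large system limit. For the measurement matrix $\Phi$ and the edges $E$ of the factor graph, we assume the following:

(M1) The subgraphs $G_m$ and $G_n$ of the factor graph within $2i$ hops from factor node $m$ and variable node $n$ are trees, meaning that there are no local loops in the graph (for precise definitions, see [15]).

(M2) For all $n \in \{1, \cdots, N\}$,

$$|\{1 \le m \le M : \Phi_{mn} \neq 0\}| = O(d),$$

and for all $m \in \{1, \cdots, M\}$,

$$|\{1 \le n \le N : \Phi_{mn} \neq 0\}| = O(d),$$

respectively. Moreover, for any $(m, n) \in E$, $\Phi_{mn} = O(1/\sqrt{d})$ as $d, N \to \infty$.

(M3) For all factor nodes $l$ in $G_n$, we have

$$\lim_{d \to \infty} \lim_{N \to \infty} \sum_{n=1}^{N} (\Phi_{ln})^2 = \frac{1}{\delta}, \qquad \lim_{d \to \infty} \lim_{N \to \infty} \sum_{n=1}^{N} (\Phi_{ln})^3 = 0.$$

For all variable nodes $r$ in $G_m$, we have

$$\lim_{d \to \infty} \lim_{N \to \infty} \sum_{m=1}^{M} (\Phi_{mr})^2 = 1, \qquad \lim_{d \to \infty} \lim_{N \to \infty} \sum_{m=1}^{M} (\Phi_{mr})^3 = 0.$$

III. Vector Belief Propagation

Let $X := [x_{nj}]_{n,j=1}^{N,J}$ and $x_j = [x_{1j}, \cdots, x_{Nj}]^T$, $g(X) = p(Y = Y_0 | X)$, $h(b) = p(b)$, $f_n(x^n, b_n) = p(x^n | b_n)$ for $1 \le n \le N$ and $1 \le j \le J$. Then, to compute the MMSE estimate of $x^n$, we need $p(x^n | Y = Y_0)$, which is given as

$$p(x^n | Y = Y_0) = \sum_{b \in \{0,1\}^N} \int p(X, b | Y = Y_0)\, dX^{-n}$$

$$\propto \sum_{b_n=0}^{1} f_n(x^n, b_n)\, p(b_n) \int g(X) \prod_{q \neq n} \sum_{b_q=0}^{1} f_q(x^q, b_q) \sum_{b_{-n,q} \in \{0,1\}^{N-2}} p(b_{-n} | b_n)\, dX^{-n},$$

where $X^{-n}$ denotes the collection of rows $\{x^j\}_{1 \le j \le N,\, j \neq n}$, $b_{-n}$ denotes $b$ with the $n$-th element omitted, and $b_{-n,q}$ denotes $b$ with both the $n$-th and $q$-th elements omitted. Since, due to independence, $p(b_{-n} | b_n) = p(b_{-n,q} | b_q, b_n)\, p(b_q)$, we have the following:

$$p(x^n | Y = Y_0) \propto \nu_{f_n \to x^n}(x^n)\, \nu_{g \to x^n}(x^n),$$

$$\nu_{f_n \to x^n}(x^n) := \sum_{b_n=0}^{1} f_n(x^n | b_n)\, p(b_n),$$

$$\nu_{g \to x^n}(x^n) := \int g(X) \prod_{q \neq n} \sum_{b_q=0}^{1} f_q(x^q | b_q)\, p(b_q)\, dX^{-n}.$$

We let $\nu_{A \to B}(\cdot)$ denote a message passed from node $A$ to its adjacent node $B$ in the factor graph. Extending the sum-product rule of belief propagation to the vector case, the messages can be represented by the following equations, where we use the superscript $(i)$ to denote estimated posteriors during iteration $i$ and $g_m$ denotes the factor of $g$ associated with the $m$-th measurement row:

$$\nu^{(i+1)}_{x^n \to g_m}(x^n) \propto \sum_{b_n=0}^{1} f_n(x^n | b_n)\, p(b_n) \prod_{l \neq m} \nu^{(i)}_{g_l \to x^n}(x^n),$$

$$\nu^{(i)}_{g_m \to x^n}(x^n) \propto \int g_m(X) \prod_{q \neq n} \nu^{(i)}_{x^q \to g_m}(x^q)\, dX^{-n}.$$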

In general, if the measurement matrix is sparse so that the factor graph has local tree-like properties, then belief propagation produces the true marginal distribution of $x^n$ given the observations $Y_0$ [15]. For dense matrices, belief propagation shows some interesting optimality properties in the large system limit [10], [13]. However, the complexity of evaluating the marginal distributions grows exponentially in $d$, so exact belief propagation is not suitable for dense matrices.



IV. Relaxed BP 



A. Derivation 



Guo and Wang's original work [15] presented an important result: the mean-square optimality of BP can be attained by a significantly simpler algorithm called relaxed BP. Relaxed BP overcomes the limitation of BP by using a Gaussian approximation of the messages to reduce computation. Therefore, similar to Guo and Wang, we assume $\nu_{g_m \to x^n}(x^n)$ to be Gaussian under the relaxed BP formulation.

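Suppose that the measurement matrix $\Phi$ satisfies the conditions (M1), (M2) and (M3). We let $\mu^n_m(i)$ and $\Gamma^n_m(i)$ be the mean and covariance of $(x^n)^T$ with pdf $\nu^{(i)}_{x^n \to g_m}$ at the $i$-th iteration, respectively. Since we have the following linear relation:

$$(y^m)^T = \Phi_{mn} (x^n)^T + \sum_{q \neq n} \Phi_{mq} (x^q)^T + (z^m)^T$$

for $m = 1, \cdots, M$ and $n = 1, \cdots, N$, the Gaussian form of the message $\nu^{(i)}_{g_m \to x^n}$ is represented as

$$\nu^{(i)}_{g_m \to x^n}(x^n) \propto \mathcal{N}_J\big(\Phi_{mn} (x^n)^T;\ z^m_n(i),\ \Sigma^m_n(i)\big). \qquad (1)$$

Here, owing to the assumption that the rows of $X$ are independent and $z^m$ has zero mean, we can easily derive:

$$z^m_n(i) = (y^m)^T - \sum_{q \neq n} \Phi_{mq}\, \mu^q_m(i), \qquad (2)$$

$$\Sigma^m_n(i) = \sigma^2 I + \sum_{q \neq n} |\Phi_{mq}|^2\, \Gamma^q_m(i), \qquad (3)$$

since $\Gamma^q_m(i)$ is the error variance of $(x^q)^T$ with the pdf $\nu^{(i)}_{x^q \to g_m}$. Now, we want to identify the pdf of the message $\nu^{(i+1)}_{x^n \to g_m}$. Due to the sum-product rule, the message $\nu^{(i+1)}_{x^n \to g_m}(x^n)$ is given by

$$\nu^{(i+1)}_{x^n \to g_m}(x^n) \propto \nu_{f_n \to x^n}(x^n) \prod_{l \neq m} \nu^{(i)}_{g_l \to x^n}(x^n). \qquad (4)$$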



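We already know that the individual messages $\nu^{(i)}_{g_l \to x^n}(x^n)$ within the product are Gaussian. Hence, the product is also Gaussian, and the pdf of the message is given by

$$\nu^{(i+1)}_{x^n \to g_m}(x^n) \propto \big[\epsilon\, \mathcal{N}_J((x^n)^T; 0, \Lambda) + (1 - \epsilon)\, \delta((x^n)^T)\big]\, \mathcal{N}_J\big((x^n)^T;\ \theta^n_m(i),\ \Sigma^n_m(i)\big), \qquad (5)$$

where

$$\Sigma^n_m(i) = \Big(\sum_{l \neq m} |\Phi_{ln}|^2 (\Sigma^l_n(i))^{-1}\Big)^{-1}$$

and

$$\theta^n_m(i) = \Sigma^n_m(i) \sum_{l \neq m} \Phi_{ln} (\Sigma^l_n(i))^{-1} z^l_n(i),$$

which are calculated by using the following formula:

$$\prod_{l} \mathcal{N}_J(x; m_l, S_l) \propto \mathcal{N}_J(x; m, S),$$

where

$$S = \Big(\sum_l S_l^{-1}\Big)^{-1}, \qquad m = S \sum_l S_l^{-1} m_l.$$

Using Lemma 1 in Appendix A, we have the following message passing rule for relaxed BP:

$$\mu^n_m(i+1) = \eta(\theta^n_m(i); \Sigma^n_m(i)) = t^n_m(i)\, w^n_m(i),$$

$$\Gamma^n_m(i+1) = t^n_m(i)\big(1 - t^n_m(i)\big)\, w^n_m(i)\, w^n_m(i)^T + t^n_m(i)\big(\Lambda^{-1} + (\Sigma^n_m(i))^{-1}\big)^{-1},$$

with

$$w^n_m(i) = \big(\Lambda^{-1} + (\Sigma^n_m(i))^{-1}\big)^{-1} (\Sigma^n_m(i))^{-1}\, \theta^n_m(i), \qquad t^n_m(i) = \tau(\theta^n_m(i); \Sigma^n_m(i)),$$

where

$$\tau(\theta; \Sigma) = \frac{1}{1 + \frac{1-\epsilon}{\epsilon} \frac{|\Lambda + \Sigma|^{1/2}}{|\Sigma|^{1/2}} \exp\big(-\tfrac{1}{2} \theta^T (\Sigma^{-1} - (\Sigma + \Lambda)^{-1}) \theta\big)}. \qquad (6)$$

The update rule of relaxed BP is still complicated due to the edge dependence of the messages. In particular, most of the computational overhead comes from the calculation of the edge-dependent $\Sigma^n_m$ due to the matrix inversion. Recall that $\Sigma^n_m$ denotes the variance of the accumulated error from the individual messages $\nu^{(i)}_{g_l \to x^n}$, $l \neq m$. Our goal is therefore to derive an edge-independent relaxed BP algorithm that removes the dependency on $n, m$ in $\Sigma^n_m$ using relaxed belief propagation in the large system limit. For this, we need some extensions of the law of large numbers and the central limit theorem, which are given in [12]. Using these results, we can now remove the edge dependence of the message passing rule for relaxed BP as shown in the following theorem.

Theorem 1. Consider the relaxed BP where $\Phi$ and $(m, n) \in E$ satisfy (M1), (M2) and (M3) for some fixed iteration number $i \ge 2$. Then as $N, d \to \infty$, we have:

$$\lim_{d,N \to \infty} \theta^n_m(k) \sim X + Z(k-1), \qquad (7)$$

$$\lim_{d,N \to \infty} z^m_n(k) \sim \mathcal{N}_J(0, \Sigma(k-1)), \qquad (8)$$

$$\lim_{d,N \to \infty} \Sigma^m_n(k) = \Sigma(k) := \sigma^2 I + \frac{1}{\delta} \Gamma(k), \qquad (9)$$

and

$$\lim_{d,N \to \infty} \Sigma^n_m(k) = \Sigma(k) \qquad (10)$$

for $k \le i$, where $X$ and $Z(k-1)$ have pdf $f_X := \epsilon\, \mathcal{N}_J(x; 0, \Lambda) + (1 - \epsilon)\, \delta(x)$ and $\mathcal{N}_J(0, \Sigma(k-1))$, respectively, and

$$\Gamma(k) := E\big[V(X + Z(k-1), \Sigma(k-1))\big],$$

and $\mu^n_m(1) = \bar{x} := E(X)$, $\Gamma^n_m(1) = \Gamma(1) := \mathrm{Cov}(X)$ for all $m$ and $n$.

Proof: See Appendix B. ■

When the measurement matrix is sparse, in the large system limit, if the average degree $d$ grows sufficiently slowly with $M$ (the precise rate is given in [15]), then there is a so-called asymptotic cycle-free property [15]. That is, the probability that a cycle of length shorter than $k$ exists approaches zero. Hence, in the large system limit, the assumption (M1) in the above theorem is asymptotically correct under this growth condition on $d$. Under this condition, the edge independence of $\Sigma^n_m$ proved in the above theorem leads us to replace the message passing rule for relaxed BP with

$$\mu^n_m(i+1) = \eta(\theta^n_m(i); \Sigma(i)), \qquad (11)$$

$$\Gamma^n_m(i+1) = V(\theta^n_m(i); \Sigma(i)), \qquad (12)$$

where

$$\theta^n_m(i) = \sum_{l \neq m} \Phi_{ln}\, z^l_n(i), \qquad (13)$$

$$z^m_n(i) = (y^m)^T - \sum_{q \neq n} \Phi_{mq}\, \mu^q_m(i), \qquad (14)$$

$$\Sigma(i) = \sigma^2 I + \frac{1}{\delta} \Gamma(i), \qquad (15)$$

where

$$\Gamma(i) = \frac{1}{|E|} \sum_{(m,n) \in E} \Gamma^n_m(i).$$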

V. Approximate Message Passing for MMV 

Recently, Donoho, Maleki and Montanari [13] developed the approximate message passing (AMP) algorithm for the single measurement vector (SMV) problem $y = \Phi x$, which shows significant advantages over the conventional iterative thresholding algorithm while achieving performance similar to basis pursuit. AMP was developed within the belief propagation framework. To execute belief propagation (or relaxed belief propagation), we must keep track of $2MN$ messages, whereas AMP only needs to keep track of $M + N$ messages, which reduces computation. AMP is therefore more suitable for large-scale applications, where basis pursuit often demands too much time.

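To derive AMP for MMV, we let

$$z^m_n(i) = z^m(i) + \delta z^m_n(i) + O(1/d), \qquad (16)$$

$$\mu^n_m(i) = \mu^n(i) + \delta \mu^n_m(i) + O(1/d), \qquad (17)$$

$$\theta^n_m(i) = \theta^n(i) + \delta \theta^n_m(i) + O(1/d). \qquad (18)$$

Substituting (16) into (13), (17) into (14), and (18) into (11), respectively, we have the following results.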

Theorem 2. For the given signal model (S1)-(S3) and the measurement model (M1)-(M3), the approximate message passing algorithm for multiple measurement vectors is given by

$$\mu^n(i+1) = \eta(\theta^n(i); \Sigma(i)), \qquad (19)$$

$$\theta^n(i) = \sum_{l=1}^{M} \Phi_{ln}\, z^l(i) + \mu^n(i), \qquad (20)$$

$$z^m(i+1) = (y^m)^T - \sum_{q=1}^{N} \Phi_{mq}\, \mu^q(i+1) \qquad (21)$$

$$\qquad\qquad + \sum_{q=1}^{N} \Phi_{mq}^2\, \eta'(\theta^q(i); \Sigma(i))\, z^m(i), \qquad (22)$$

$$\Sigma(i+1) = \sigma^2 I + \frac{1}{\delta} \Gamma(i+1), \qquad \Gamma(i+1) = \frac{1}{N} \sum_{n=1}^{N} \Big[ t_n(i)\big(1 - t_n(i)\big)\, w_n(i)\, w_n(i)^T + t_n(i)\big(\Lambda^{-1} + (\Sigma(i))^{-1}\big)^{-1} \Big], \qquad (23)$$

and $t_n(i) = \tau(\theta^n(i); \Sigma(i))$ and $w_n(i) = (\Lambda^{-1} + \Sigma(i)^{-1})^{-1} (\Sigma(i))^{-1} \theta^n(i)$, and $\eta'(\theta^n(i); \Sigma(i))$ denotes the derivative of $\eta(\theta^n(i); \Sigma(i))$ with respect to $\theta^n(i)$.

Proof: See Appendix C. ■

VI. Case Study: Uncorrelated Snapshots

The AMP update rule can be further simplified when the input source vectors $\{x_j\}_{j=1}^{J}$ are uncorrelated with each other. This scenario is the most optimistic in estimating the sparse support since rank($X$) determines the upper bound of maximal sparsity [1].

More specifically, consider the signal model (S1)-(S3) with $\Lambda = I$, where $I$ denotes the identity matrix. In this setting, $\Sigma(i) = c(i) I$ and $\Gamma(i) = \gamma(i) I$ so that we have

$$\mu^n(i+1) = \eta(\theta^n(i); c(i) I) = t_n(i)\, \frac{\theta^n(i)}{1 + c(i)},$$

where the shrinkage operator $t_n(i) = \tau(\theta^n(i); c(i) I)$ is given by

$$t_n(i) = \frac{1}{1 + \frac{1-\epsilon}{\epsilon} \big(1 + c^{-1}(i)\big)^{J/2} \exp\Big\{ -\frac{\|\theta^n(i)\|^2}{2 c(i) (1 + c(i))} \Big\}}.$$

Fig. 1 plots the shrinkage operator output with respect to the normalized input value, $\|\theta^n(i)\|^2 / J$, for various values of $J$ when $c(i) = 0.1$. As $J$ increases, it clearly exhibits a hard-thresholding behaviour with the threshold value of $c(i)(1 + c(i)) \log(1 + c^{-1}(i))$ (see Appendix D for proof).

Fig. 1. Shrinkage operator for various numbers of snapshots for $c(i) = 0.1$ and $\epsilon = 0.1$.

Thanks to the hard-thresholding behavior, the AMP update can be further simplified. First, we can easily see that $t_n(i)(1 - t_n(i)) \to 0$. Therefore, we have

$$\gamma(i+1) = \epsilon(i)\, \frac{c(i)}{1 + c(i)}, \qquad (24)$$

where
$$c(i) = \sigma^2 + \frac{\epsilon(i-1)}{\delta}\, \frac{c(i-1)}{1 + c(i-1)}, \qquad (25)$$

$$\epsilon(i) = \frac{1}{N} \sum_{n=1}^{N} t_n(i), \qquad (26)$$

which denotes the ratio of the rows whose $\ell_2$ norm exceeds the threshold. In a large system limit as $N \to \infty$, Appendix E shows that $\epsilon(i) \to E[t_n(i)] = \epsilon$. Therefore, the corresponding state evolution is

$$c(i+1) = \sigma^2 + \frac{\epsilon}{\delta}\, \frac{c(i)}{1 + c(i)}. \qquad (27)$$

The following theorem provides an important observation for the convergence of the state evolution.

Theorem 3. In the noiseless case, $c(i)$ converges to $0$ regardless of the initial condition if and only if $\epsilon < \delta$.

Proof: See Appendix F. ■

Theorem 3 informs us that the minimum undersampling ratio for AMP convergence approaches the sparsity rate $\epsilon$ as the number of snapshots increases. Considering existing results stating that $\epsilon$ is the minimum sampling rate we can achieve, AMP provides a computationally efficient framework to achieve the optimality.

VII. Numerical Results

Here, the experimental parameters are as follows: $M = 50$, $N = 100$, $J = 3$, $\epsilon = 0.1$ and $d = 20$. The sparse sensing matrix $\Phi$ that satisfies (M1)-(M3) is generated by first drawing elements of $\{-1, 1\}$ with equal probability, retaining the values with probability $d/M$, and scaling by $1/\sqrt{d}$.

Fig. 2 illustrates the normalized squared error (NSE) for relaxed BP, relaxed BP with edge independence, and AMP, when the signal correlation matrix $\Lambda = I$ and SNR = 30 dB. The results in Fig. 2 clearly demonstrate that all algorithms converge to nearly the same MSE value.

Fig. 2. Convergence of relaxed BP, relaxed BP with edge-independency and AMP (horizontal axis: iteration number).

VIII. Conclusion

We showed that a vector form of message passing is appropriate to describe belief propagation in the MMV problem. Then, we adopted the idea of Guo and Wang to approximate the messages as Gaussian pdfs and provided a relaxed BP algorithm that passes only means and covariances. It turns out that the resulting relaxed BP has an interesting shrinkage operator within the update, which is a function of the norm of the signal row vector. To reduce the computational overhead, we derived a rigorous condition for an edge-independent covariance update for the relaxed BP. Finally, we derived the AMP algorithm, which completely removes the edge dependence even in the mean update and has complexity comparable to other iterative thresholding algorithms. Furthermore, using state evolution, we derived a sufficient condition for joint sparse recovery, which showed that AMP achieves the optimality as the number of snapshots increases.
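Appendix A

For the calculation of the mean and variance of the message $\nu^{(i)}_{x^n \to g_m}$, we need the following lemma.

Lemma 1. Suppose that a random variable $x \in \mathbb{R}^J$ has pdf $f(x)$ as

$$f(x) \propto \left[ \frac{\epsilon}{(2\pi)^{J/2} |\Lambda|^{1/2}} \exp\Big(-\frac{1}{2} x^T \Lambda^{-1} x\Big) + (1 - \epsilon)\, \delta(x) \right] \frac{1}{(2\pi)^{J/2} |\Sigma|^{1/2}} \exp\Big(-\frac{1}{2} (x - \theta)^T \Sigma^{-1} (x - \theta)\Big)$$

for some $\theta \in \mathbb{R}^J$. Then we have the following:

$$E(x) = \eta(\theta; \Sigma) := \tau(\theta; \Sigma)\, \phi,$$

$$\mathrm{Cov}(x) = E(x x^T) - E(x) E(x^T) = V(\theta; \Sigma) := \tau(\theta; \Sigma)\big(\phi \phi^T + (\Lambda^{-1} + \Sigma^{-1})^{-1}\big) - \tau(\theta; \Sigma)^2\, \phi \phi^T,$$

where

$$\phi := (\Lambda^{-1} + \Sigma^{-1})^{-1} \Sigma^{-1} \theta, \qquad \tau(\theta; \Sigma) := \frac{1}{1 + \frac{1-\epsilon}{\epsilon} \frac{|\Lambda + \Sigma|^{1/2}}{|\Sigma|^{1/2}} e^{-\frac{1}{2} \theta^T (\Sigma^{-1} - (\Sigma + \Lambda)^{-1}) \theta}}.$$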
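The quantities in Lemma 1 translate directly into a few lines of linear algebra. The sketch below is illustrative only: the function name `spike_slab_posterior` is hypothetical, and evaluating $\tau$ through a log-ratio and `tanh` is a numerical-stability choice made here, not something prescribed by the paper.

```python
import numpy as np

def spike_slab_posterior(theta, Sigma, Lam, eps):
    """Return tau(theta; Sigma), eta(theta; Sigma) and V(theta; Sigma) from Lemma 1."""
    Sinv = np.linalg.inv(Sigma)
    # H = (Lambda^{-1} + Sigma^{-1})^{-1}, covariance of the Gaussian (slab) component.
    H = np.linalg.inv(np.linalg.inv(Lam) + Sinv)
    # phi = H Sigma^{-1} theta, mean of the Gaussian (slab) component.
    phi = H @ Sinv @ theta
    # Log of the ratio inside tau:
    # ((1-eps)/eps) * sqrt(|Lam+Sigma|/|Sigma|) * exp(-0.5 theta^T B theta).
    B = Sinv - np.linalg.inv(Sigma + Lam)
    _, logdet_sum = np.linalg.slogdet(Lam + Sigma)
    _, logdet_sig = np.linalg.slogdet(Sigma)
    log_ratio = (np.log((1.0 - eps) / eps)
                 + 0.5 * (logdet_sum - logdet_sig)
                 - 0.5 * theta @ B @ theta)
    # tau = 1 / (1 + exp(log_ratio)), written with tanh to avoid overflow.
    tau = 0.5 * (1.0 - np.tanh(0.5 * log_ratio))
    mean = tau * phi
    cov = tau * (np.outer(phi, phi) + H) - tau**2 * np.outer(phi, phi)
    return tau, mean, cov
```

For $\Lambda = I$ and $\Sigma = cI$, the returned `tau` reduces to the scalar shrinkage operator of the uncorrelated-snapshot case, and `mean` reduces to $\tau\,\theta/(1+c)$.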



Appendix B 

We need the following two theorems to prove the claim. 
Theorem 4 (Law of Large numbers lfT2l ). For each N and 



tie- S)(</)0^ + (A^^ + S"^)"^) - tie- Hfcbd)^' " '^'■'^^ """^ G M"', I = 1, • • • ,mbe a set 

' ' o/ independent random variables satisfying 



lim lim x^ , 



2^ where X denotes a random variable with a pdf fx.{x), which 

' '-, m 1 1 laTi-r-i tvTaT^TTo • denotes the distribution of limiting random vector x and 

1 + A + S 2/ S 2e"2» -(s+A) )e ^ j s _ ^ _ 

<^ I — i[I\,a) — {I,-- - , m| IS any deterministic sequence. 

Proof: Let Here, X ^ Y denotes that two random vectors X and Y 



exp(-^x^A ^x) 

« ,0 A^IAI^ + " 

(27r) 2 |A| 2 
exp(-i(x-6»)^S-i(x-0)) 
(27r)^|S|5 

' exp(-^x^A-^x- l(x-6))^S-^(x-6))) 
' (2^)'^|A|^|S|^ 

+ 1 - £ % J ^5 X 

(2^)i|S|5 



have the same distributions. Let a% ^ be a set of non-negative 
deterministic constants such that 6^ ^ — 0{1/Vd) and 

= 1. 



Then 



lim lim > air . 



lim lim ^a^X,.~^(X)- 



Since 



x^A-^x + (x - 0)^S-i(x - 0) 

(x - <f}fs-\^ -</.) + e^A^e, 



(28) 



where 

H^(A- 



A + S 



By plugging in Eq. ( |28] ) into the probability density function 
/(x) and integrating out, we have 



Theorem 5 (Central Limit Theorem fTzj)- x^r ^ be as in 

Theorem^such that for any deterministic sequence of indices 
i = i{N, d) G {1, • • • , to}, we have the limit 

lim lim \Q\E{^% i) - E{X)\ = 0. 

d—>oo N^oc 

Also suppose that afj ^ be a set of non-negative deterministic 
constants such that af^ ^ = 0{1/Vd) and 

rn m 

lim lim \a%f = 1 and lim lim ^(a^,,)^ = 0. 



i=l 



Then 



(27r)t|A|5 



-exp 



— I — - exp l^l-e'^'E-^e 



The resulting pdf f{x) is then given by 



(27r)7 |H|i 



^ — "I"^ "I" 

S| 2 /|A| 2 



<5(x) 



1 



lim lim ^a^,,(x^,,-i?(X))^AA(0,var(X)). 

i=l 

Proof of Theorem 2: First, by applying (M2) and (M3), we 
can easily see that ( fTOl i holds for fc = r if (|9]l holds for k = r. 
Hence we will show that ([8]l, ^ and ( fTOl ) holds for fc = 1, 
and the claim holds for any fc < i by induction. By with 
Tlil) = r(l) = Cov(X), 

s:r(i) = <^'/+Ei*"«i'r(i)- 

9#n 



By the assumption (M2) and (M3), we can easily see that = Cov[(X - 7?(X + Z(r - l);S(r- 1)))] 

^ <J^I + (l/<5)r(l) as d, ^ oo. Also, by © and = E\V(X + Z(r - 1): S(r - 1))1 = r(r). 

/i;^(l) = S(X)=x, wehave 



^, By using the central limit theorem on (I29t . we have 

^"^^ _ limd.Ar^ooC(^ + l)--A0(0,tT2/+(l/J)r(r)). 

q£N{ni)\{n} Finally, we show that if ^ and (|9]l holds for k = r, ^ 

$mq(x^)"^ + (z™)^ — E^ $„iq/x^(l) hoMs for fc = r + 1. By the induction hypothesis, we have 
geAf(m) 9eJV(m)\{«} for all (m, n) e -E. By the assumption (M3), 

we have 



= $,^„(x«)^ + (z™)^ + *™,((x^r-i). 

geA'(m)\{n} 

By the assumption (Ml), the terms in the sum of ( |29] l are 
independent and (M2) makes the first term disappear so that 
by the modified central limit theorem and the condition (M2), 



J2 l^inlHKir))-' 



z™(l) ~ AO fo,a2/+icov(X-x)) ^ E 



= A/;7(0,ct2/+ ir(l)). as (i,iV->cx). Letting e'(r) = zj,(r)-$,„(x")^, by (M2) and 

XT u .u . pti u ij r 7 ^ J jrn. the assumption, we have Cov[e'(r)l ^ S(r — 1) as d, — > 

Now, we show that if (mi holds for fc = r, then dSb and d9]i r,,, , l v ;j v / ) 

1 ij ^ 7 n > J r r, / \ J T^n / N OO- Then we have 

holds for fc = r. By the dehnition of At„i(f) and F„^(r), we 

have Hm CM = Hm 'i'^^e'W + ^ '^Llx-f] 



lim KJ,(r)^,7(X + Z(r-l);S(r-l)) 



and 



lim C(r) - 1/(X + Z(r- l);S(r- 1)) by using (M2) and (M3). By the central limit theorem, 

CM-x + z(.-i). 

where X and Z(r-l) has pdf/x := e7V,7(x; 0, A) + (l-e)(5(x) 

and J^j{z{r — 1); 0, S(r — 1)), respectively. Note that APPENDIX C 

^ll'ir) = cr^I + Y^ |4>„,,pr^(r). Substituting dH into (O, we have 



M 



By the assumption that G„i„(r) is a ti-ee, the terms in 9l\{i) = $;n[z'(i) + Szl^{i)] ^ $;„[z'(z) + fa^(z)] 
the above summation are statistically independent. Since i=im i=i 

lim<j,Ar^oo r;:,(r) ^ y(X + Z(r - 1); S(r - 1)), by the law -$„„z'"(i) + 0(l/d) 

of large numbers in Theorem |4] 

so that the followings hold: 

lim K'ir) ~ a^I+-E[V{X + Z{r-l);^{r-l))] 
= al+-r{r). 

Next, we consider z;^'(r). Note that = -$,„„z'"(z). (30) 

z™(r) = (y'")^ — E *^ ('^^ Similarly, substituting (fTTl i into ( fT4l i. we have 

= z™ + <i>™„M™w + E'^'™?((^') «^ 

By the assumption (M2), <^>„^nKnir) = 0(1/Vd) ^ as " ^^"^^ - E*""?/^'(*) + + ^'(l/c') 

d, N ~> oo. Furthermore, we have ^~ 

Um ((x'')^ - f^lir)) ^ X - ,(X + Z(r - 1); S(r - 1)) ^° 

a,N—^oo jV 

so that z™(i) = (y'»)^-E'i'-9M'W (31) 

^_lim^Cov[(x^)^-/x?„(r)] ^^„.(^) ^ $_M"(z)."' (32) 



Finally, substituting ( fT6] l into ( fTTT i. we have the following APPENDIX E 

Taylor series expansion Using Eq. d?) in Theorem [T] in a large system limit, we 

- 77(0"W;S(z))0"(*)-77'(0"(z);SW)<l>„,„z'"W X + Z(z - 1) (37) 

where X and Z(fc - 1) has pdf /x eA/^x; 0, A) + (1 - 



so that 



/x"(z + l) = 7?(0"(z);SW) (33) 
<5/x;^(z + l) = -77'(0"(z);SW))<i>,„„z™«. (34) 



Hence is updated according to 

M M 

(=1 (=1 

M 



e)(5(x) and7Vj(0,S(i-l)). Since the two RV's X and Z(i - 
1) are independent, the corresponding pdf can be therefore 
derived by convolving the two pdfs, providing us 

iifl"(i)ip iifl"(-i)ir 



e 2(i+c(,)) e 
- + (1 - e) 



(2^)*(l + c(i))5 



(27^)*c(^)^ 



(38) 



(=1 



which can be derived using the similar techniques used 
in Lemma □ As iV ^ oo, E^i ^ 
i? Using dSll and (|38] |, we can easily see that 

(35) tn{e'"{i))f{ff\i)) = ee"5fwT7/(27r)*(l + c(i))i so we 
have 



in the large system limit by ( |29] l and 

z''"(i + 1) is updated according to 



and (M3). Also, 



Z™(l + 1) 



N 



e 2(i+c(i)) 



(2^)t(l + c(i))3 



9=1 

TV 

(y™)T _ ^ ci>„J;.'^(^ + 1) - S(*))<1>™,Z™(*)] 
9=1 

TV Af 
\T \ " 



This concludes the proof. 



9=1 



+ 1) + E rl'{0'{^)■,c{^)X,^'^{ 

9=1 



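Here, $\eta'(\theta; \Sigma)$ in (34) can be calculated using the first order derivative of (6) with respect to $\theta$:

$$\eta'(\theta; \Sigma) = \tau(\theta; \Sigma)\, (\Lambda^{-1} + \Sigma^{-1})^{-1} \Sigma^{-1} + (\Lambda^{-1} + \Sigma^{-1})^{-1} \Sigma^{-1}\, \theta\, \tau'(\theta; \Sigma)^T,$$

where $\tau'(\theta; \Sigma)$ denotes the derivative of $\tau(\theta; \Sigma)$ with respect to $\theta$:

$$\tau'(\theta; \Sigma) = \frac{1-\epsilon}{\epsilon}\, \frac{|\Lambda + \Sigma|^{1/2}}{|\Sigma|^{1/2}}\, e^{-\frac{1}{2} \theta^T (\Sigma^{-1} - (\Sigma + \Lambda)^{-1}) \theta}\, \tau^2(\theta; \Sigma)\, \big(\Sigma^{-1} - (\Sigma + \Lambda)^{-1}\big)\, \theta. \qquad (36)$$

Appendix D

Note that

$$\lim_{J \to \infty} (1 + c^{-1})^{J/2}\, e^{-\frac{\|\theta\|^2}{2c(1+c)}} = \lim_{J \to \infty} e^{\frac{J}{2} \left[ \log(1 + c^{-1}) - \frac{\|\theta\|^2 / J}{c(1+c)} \right]}.$$

This value becomes $0$ when $\|\theta\|^2 / J > c(1+c) \log(1 + c^{-1})$; $1$ when $\|\theta\|^2 / J = c(1+c) \log(1 + c^{-1})$, and $\infty$ otherwise. Therefore, due to the definition of the shrinkage operator, this concludes the proof.

Appendix F

Let us first characterize the behavior at the fixed point $c(i) \to x$ of the state evolution:

$$x = \sigma^2 + \frac{\epsilon}{\delta}\, \frac{x}{1 + x}.$$

For the noiseless case $\sigma^2 = 0$, the fixed points correspond to the intersections of $y = x$ and $y = \frac{\epsilon}{\delta} x / (1 + x)$ for $x \ge 0$. We can easily see that one of the intersections is $x = 0$ and the other depends on the slope of $y = \frac{\epsilon}{\delta} x / (1 + x)$ at $x = 0$. Since the slope is $\epsilon/\delta$, we can easily see that there exist no intersections other than $x = 0$ when the slope is less than or equal to one, i.e. $\epsilon/\delta \le 1$. This is the optimal scenario since the resulting error becomes zero regardless of $c(1)$. Next, to complete the proof, we need to show that the fixed point iteration Eq. (27) converges. This can be readily shown since $c(i+1) = \frac{\epsilon}{\delta} c(i) / (1 + c(i)) < c(i)$ for $\epsilon < \delta$ for all $i \ge 1$. Since the sequence $c(i)$ is monotone decreasing and there exists a fixed solution $c^* = 0$, the algorithm converges from any initialization. This concludes the proof.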
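The fixed-point argument can be checked numerically by iterating the state evolution (27) directly. The following is a minimal sketch; the function name `state_evolution`, the initial value `c0` and the iteration count are hypothetical choices made for illustration.

```python
def state_evolution(eps, delta, sigma2=0.0, c0=1.0, iters=200):
    """Iterate c(i+1) = sigma^2 + (eps/delta) * c(i) / (1 + c(i)) and return the trajectory."""
    c = c0
    history = [c]
    for _ in range(iters):
        c = sigma2 + (eps / delta) * c / (1.0 + c)
        history.append(c)
    return history

# Noiseless case with eps < delta: the sequence decreases monotonically towards 0.
below = state_evolution(eps=0.1, delta=0.5)
# Noiseless case with eps > delta: the sequence settles at the non-zero
# intersection x = eps/delta - 1 of y = x and y = (eps/delta) x / (1 + x).
above = state_evolution(eps=0.6, delta=0.5)
```

In the second run the iteration stalls at a strictly positive error level, which is the behaviour the condition $\epsilon < \delta$ in Theorem 3 rules out.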



